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We investigate the prospects for constraining alternative theories of gravity with a typical near- 
term low-budget 21 cm intensity mapping experiment. We derive the 21 cm brightness temperature 
perturbation consistently in linear theory including all line-of-sight and relativistic effects. We 
uncover new terms that are a small correction on large scales, analogous to those recently found in 
the context of galaxy surveys. We then perform a Fisher matrix analysis of the Bq parametrization 
of f(R) gravity, where Bo is proportional to the square of Compton wavelength of the scalaron. We 
find that our 21 cm survey, in combination with CMB information from Planck, will be able to place 
a 95% upper limit of 7 x 10 -5 on Bo in fiat models with a ACDM expansion history, improving 
on current cosmological constraints by several orders of magnitude. We argue that this constraint 
is limited by our ability to model the mildly non-linear regime of structure formation in modified 
O ■ gravity. We also perform a model-independent principal component analysis on the free functions 

' introduced into the field equations by modified gravity, /i and E. We find that 20-30 modes of 

the free functions will be 'well-constrained' by our combination of observables, the lower and upper 
limits dependent on the criteria used to define the 'goodness' of the constraint. Our analysis reveals 
that our observables are sensitive primarily to temporal variations in £ and scale variations in 
/j,. We argue that the inclusion of 21cm intensity maps will significantly improve constraints on 
any cosmological deviations from General Relativity in large-scale structure in a very cost-effective 
■ niaiuuT. 
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A major challenge in cosmology is understanding the late-time accelerated expansion of the Universe @, |]. If 
we assume that gravity is described by General Relativity (GR), such an expansion requires the existence of a new 
exotic form of energy with negative pressure, called dark energy. An alternative solution is to give up GR at large 
(cosmological) scales and introduce modifications to the laws of gravity (for a review, see ||). 
Whilst observations that measure the expansion history can rule out specific dark energy and modified-gravity 
^vq , models, there is usually sufficient freedom in the form of the modification to reproduce any desired expansion history, 
f* 1 **. ■ thus rendering these measurements unable to distinguish between a (possibly time-dependent) dark energy component 
and modified gravity [Q. To constrain these models, we must therefore go beyond the level of the background, and 
. study the perturbed Universe. It is here that an unclustered dark energy component may be distinguished from 
y—i ' modified gravity 1 . 

Current and future optical surveys focus mainly on two promising observables: galaxy number counts and cosmic 
shear. These observables are both sensitive to the distribution of matter in the Universe and therefore provide 
powerful probes of the laws of gravity. In this paper we concentrate on a complementary observable measured in radio 
surveys: the redshifted 21 cm emission line of neutral atomic hydrogen (HI). We are interested in the post-reionization 
emission that traces the distribution of hydrogen at low redshifts. The advantage is that after reionization, the physics 
of the 21cm power spectrum is simple and can be accurately modelled in linear theory [Q, |J. This is in contrast 
to observations of the epoch of reionization, where the 21cm power spectrum is very sensitive to the details of the 
reionization process || . 

The post-reionization signal is thought to be dominated by emission from dense self-shielded damped Lya sys- 
tems (DLAs) which form in high-density reg ions, with some of the signal originating from optically-thin Lya absorbers 
in low-density regions (for reviews, see [fl0| — [l2|[ ) . On large (linear) scales, it is expected that these sources trace out 
the underlying matter density field, up to a scale-independent bias factor if the underlying density field is Gaussian 
(for non-Gaussian corrections to this argument, see for example The 21cm post-reionization signal therefore 

provides a new window to observe the distribution of matter. Recently, measurements of the 21 cm cosmological signal 
at z = 0.8 have been made by cross-correlating with optical galaxies O, 15 . 
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ver that some modified-gravity models may still remain indistinguishable from a clustered dark energy model at the perturbed 
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The main advantage of the 21 cm method over standard optical surveys is that it does not require the resolution of 
individual galaxies: one detects the diffuse line radiation from a large number of sources. This technique, called 21 cm 
intensity mapping | jl6[ thus allows one to observe large volumes of the sky efficiently with low-resolution interferometers 
or single-dish experiments which can be constructed relatively cheaply Jl7| - [l9| . It also permits observation at higher 
redshifts than optical surveys, which are limited by the necessity to identify galaxies individually, a task that becomes 
increasingly difficult as galaxies appear fainter. A further advantage of observing the 21 cm line is that source redshifts 
are readily obtainable from the observation frequency. This allows one to perform a tomographic analysis with very 
thin redshift bins, and consequently to study the evolution and scale dependence of the underlying matter density 
field. Moreover, precise redshift information enhances the significance of the redshift-space distortion term, which is 
suppressed in optical surveys with only photometric redshifts by uncertainties in the redshifts. It follows that 21cm 
surveys are very well suited to cosmological parameter inference 20j], measurement of baryon acoustic oscillations 
(BAO) [^l[ ^2|, constraining the properties of dark energy ^3|, and constraining models of modified gravity p5| . 

Experimentally, observing at low redshifts (i.e. after reionization) has the advantage that the foreground contam- 
ination from Galactic synchrotron emission is significantly reduced compared to epoch-of-reionization experiments, 
since its amplitude scales roughly as v~ 2J (for a thorough investigation of foregrounds in the context of a single dish 
experiment, see fll7|). Although foreground removal will still present a significant challenge in recovering the cosmo- 
logical signal, this should be possible since the foreground components are smooth in frequency space, as opposed 
to the signal which has significant structure in the frequency (i.e. radial) direction due to the discrete nature of the 
underlying sources and their small-scale clustering. 

However, there is a price to pay for observing the Universe at high frequencies. Since most of the HI is destroyed 
during the epoch of reionization, the signal is significantly smaller at post-reionization redshifts. Although the 
foregrounds are also smaller as mentioned above, the system temperature of the detector is still significant due to the 
receiver noise floor. A typical antenna temperature for a near-term experiment is 50 K , while the sky temperature 
at low redshifts is typically 5-10 K pi. The system temperature at the receiver is thus dominated by the design of 
the antenna. A further disadvantage of working with the low redshift signal is that the typical co-moving length scale 
below which linear perturbation theory is expected to break down is larger than at high redshift, due to hierarchical 
structure formation. Since we do not attempt to model the non-linear 21cm signal in this work, the number of 
independent modes of the density field accessible to us is much smaller than at high redshift. However, even with 
these considerations, the 21cm intensity mapping method provides a potentially powerful tool for learning about late- 
time structure formation, at a fraction of the cost of spectroscopic galaxy surveys or high-redshift epoch-of-reionization 
21cm surveys such as those planned with the Square Kilometre Array (SKA). 

In this paper, we provide a consistent derivation of the perturbation to the 21 cm brightness temperature in linear 
theory, including all relativistic effects which could be relevant on large angular scales, analogous to the new terms 
found in the context of galaxy surveys in pq-p^|. Our forecasts of the potential of 21cm mapping for constraining 
modified-gravity models complements the work of Masui et al. j24|, which considered only information from the 
BAO signal and the reconstructed weak lensing signal in two specific models. In contrast, we include the full three- 
dimensional clustering information in angular and redshift space provided by the intensity maps and consider more 
general modifications to gravity. We try to keep our experimental setup as general as possible, but choose a redshift 
range of 0.7 < z < 2.5, appropriate to that of the Canadian Hydrogen Intensity Mapping Experiment (CHIME) [5j| . 

The plan of this paper is as follows. In Sec. [Il] we derive the mean 21 cm brightness temperature and its perturbation 
consistently in linear theory, including all line-of-sight and relativistic effects. In Sec. Ill we discuss models of 
modified gravity, and the response of the 21 cm signal to deviations from GR. We also set up the parametrizations of 
modifications to gravity there which are used for our forecasts. In Sec. [V we discuss 21cm intensity mapping and 
the experimental setup assumed, before describing our statistical methods and presenting our results in Sec. [y|. We 



conclude in Sec. VI 



The cosmological parameters varied in our statistical analyses are (flbh 2 , 0. c h 2 , r, h, A s , n 8 , fl u h 2 ), with fiducial 
values of (0.0223, 0.112, 0.085, 0.73, 2.04 x 10~ 9 , 0.966, 0.0006) and pivot scale 0.05 Mpc -1 for the primordial power 
spectrum. The background expansion history is always that of ACDM models, and we assume three neutrino species 
with degenerate masses. Our metric convention is +, — , — , — and, unless stated otherwise, we take c = 1. 



II. THE 21 CM BRIGHTNESS TEMPERATURE 



In this section we derive the perturbation to the 21cm brightness temperature in linear theory, including all 
relativistic effects. Much of the material in this section is drawn from [^9| and Appendix A of |^7|. Except where 
stated explicitly, the results in this section are valid for all metric theories of gravity. 

Let the rest-frame (proper) number density of neutral hydrogen atoms at redshift z along some line of sight be 
riHij with a fraction rti/nni being in the excited triplet states and no/nni in the singlet state of the 21 cm hypcrfinc 
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transition. Neglecting the finite line width of the emission, which should be fine on the large scales we consider, in 
the rest-frame of the gas the net change in the number of photons per volume with energy E and E + dE propagating 
within a solid angle dil in proper time dt due to 21cm interactions is 

drwt = — [(m - 3n )AA 7 + m] A 10 S(E - E 21 )dEdtdSl, (1) 

Here, J\f 1 is the photon occupation number, A\q ss 2.869 x 10~ 15 s _1 is the spontaneous emission coefficient |3(J, and 
E21 = 5.88 fieV is the rest-frame energy of a 21 cm photon. In Eq. ([!]), we have assumed that the atomic triplet states 
are populated isotropically (see j29j for further discussion of this point). 

The level populations define the spin temperature T s by hi/hq = 3e~ T21 / Ts , where T 2 \ = E^x/Hb = 0.068K and 
ks is Boltzmann's constant. We assume that the radiation field at the relevant frequencies consists of the CMB 
blackbody with temperature Tcmb 3> T 2 i and the additional photons from the 21cm interaction. At low redshift, 
T s Tcmb [0 since the spin temperature is coupled to the gas temperature by Lya photons, and stimulated emission 
and absorption can be neglected 2 . In this limit, Eq. ([!]) becomes 

dn cmit pa r^n m A w S(E - E 21 )dEdtdQ . (2) 

Note that this result is independent of the spin temperature. 

Since re-absorption is negligible, and neglecting Thomson scattering of the anisotropics in the line radiation (the 
Thomson optical depth to z — 2 is 0.008), we can calculate the brightness temperature by simply summing up the 
emitted photons. These are received by an observer with 4-velocity u a along a line-of-sight direction n. In terms of 
the photon distribution function f(E, n), the number of photons collected with energies between E and E + dE in 
an area dA subtending a solid angle at the observer of dfi in proper time dt is 

dn roc = f(E, n)E 2 dEdfldAdt. (3) 

We relate dn rec and dn om it by considering the propagation of the congruence of null geodesies that focus at the 
observer. Let dA be the invariant area of the wavefront at affine parameter A corresponding to some source position. 
In an interval of affine parameter dA, the wavefront sweeps out a volume dAu^kadX, where it" is the source 4-velocity 
and k a is the wavevector, k a — dx a /d\. If the collecting area of the detector, dA, subtends a solid angle dfi at the 
source in its rest-frame, then photons in an (observer-frame) energy range dE around energy E will be collected in 
time dt in solid angle dQ with number 
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dEdtdhdA, (4) 



where the integral in the first equation is along the line of sight, and in the second the quantities are evaluated at 
redshift z along the line of sight, where 1 + z = E 2 \JE. (Additionally, we used k a u a B = E 2 \.) Using the reciprocity 
relation dAdfi = dAdf2/(l + z) 2 , comparison with Eq. @ gives 

dA 
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dz 



(5) 
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Finally, the 21cm brightness temperature T\> is related to the photon distribution function by fc^Tt = h 3 Ef/2, where 
h is Planck's constant. It follows that 



rr, , 3 h 3 n m A 10 

T b (z,n) 



dA 

dz 



(6) 



32tt k B E 21 

If we initially ignore perturbations, |dz/dA| = (1 + z)H(z)E 2 i (where H(z) is the Hubble parameter) and the 
background brightness temperature is (reinstating factors of c) 

3(/ic) 3 n H i-4io 



32nk B E 2 1 (l + z)H(z) 
= 0.188K/tr>Hi(z) (1 J:y , (7) 



2 This follows for the CMB photons since T B Tcmb- For the additional 21cm photons, the optical depth for re-absorption can be 
shown to be O(10 — 2 )T2i/T s 1 around 2 = 1, and is therefore negligible. 
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where E(z) — H(z)/Hq, the Hubble constant Hq — lOOh kms _1 Mpc _1 , and Qm{z) is the comoving mass density 
in HI in units of the current critical density. Where required in this work, we take fini = 4 x 10~ 4 to be constant, 
consistent with the local value found in the HIPASS survey . The value of flm is not expected to vary significantly 
over the redshift range we consider (3^] . 

In the presence of perturbations, in Eq. (^|) we must evaluate the perturbed nni at the perturbed position appropriate 
to the redshift z and line of sight n, and include the perturbation in |dz/dA|. We will work in the conformal Newtonian 
gauge, where the metric takes the form 

ds 2 = a 2 (ri) [(1 + 2*)cfy 2 - (1 - 2<S>)S ij dx i dx j ] . (8) 

Here, r\ is conformal time, a(rj) is the scale factor and the spacetime-dependent gravitational potentials are ^ and $. 

An observer at rest in this coordinate system has 4-velocity = a _1 (l — v I')(Sq, which is parallel to d„. We can 
equip this observer with an orthonormal tetrad of 4-vectors, (e Q ) a such that (eo) a = u a and e s ; = a _1 (l + 
Decomposing the wavevector onto the tetrad gives 

^ = (l + 4> + *)e, ^ = a- 2 e(l - *), (9) 

where the comoving energy e is defined by k a u a = e/a, and the unit 3- vector e consists of the spatial tetrad components 
of the propagation direction. In addition we have the geodesic equations for the photons 

de de d*f? • . . 

- = -Va.(* + *) > - = - e _ +e (* + *), (10) 

where Vj_ = V — ee • V. The derivatives here are along the line of sight and overdots denote partial derivatives with 
respect to rj. 

The source velocity (i.e. the bulk velocity of the HI) may be written as u a s = u a + v a such that = a _1 [l — 'i', v l ] 
where v % are the spatial tetrad components of v a . Similarly, the 4-velocity of an observer at position A moving with 
3- velocity v % oA in the conformal- Newtonian frame can be written u^ A = a A [1 — ~*5>AtV 1 oA \. The redshift of the source 
measured by this observer at A is then 

i + 2 - = £d_l(i + fi.[ v _ VaA ]), (n) 

a e A 

where n = e is the line-of-sight direction of the photons as seen by the observer. (Aberration and evolution of e can 
be neglected to the order we are working.) Integrating Eq. ( |i~0| ) gives the ratio of Newtonian-gauge energies 



and so the redshift is 



+ (12) 



I] A 



l + z = -^(l + y A -y+ /'(* + *) dr/ + n • [v - v oA ] ) . (13) 



a{rf) 



>)A 



Writing the perturbed conformal time at redshift z along the line of sight n as T)(p., z) = r/ z + 5rj, where r/ z is the 
unperturbed value, it follows from Eq. (|l^) that 

H{7) z )5ti = ^a-^+ /"'(* + *) dr/ + n- [w-w oA ], (14) 

where all quantities on the right-hand-side are evaluated on the zero-order lightcone. Here, % = a/a is the conformal 
Hubble parameter. 

Writing the perturbed nni as nm(l + fi n ) and evaluating at ?/(n, z), the density term in Eq. (^) becomes Hhi(%)[1 + 
fin + (nm/nm)fiv\- For |dA/dz|, we differentiate Eq. (|l3| ) and use Eq. (||); evaluating at the perturbed conformal time 
gives 



dA 



(z,n) - 



n(fj z )E 21 (i + z) 



(15) 
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Combining these results, and using H — aH, gives the background brightness temperature in Eq. (Q). The fractional 
perturbation to the brightness temperature is 3 

A Tb (z, n) = 5 n + *™8r, - f % - U J S V - 1^ + 1($ + *) + in • ^ + n • v + *. (16) 
nm V ft / ft drj ft ft dij 

If the comoving number density of HI is conserved at low redshift (i.e. the ionized fraction of hydrogen is constant), 
hni/n~m — — 3iH- Using the Euler equation v + ftV + = 0, 4 some further cancellations occur leaving 

A Tb (*, n) = 6 n - in • (n • Vv) + - \ - 2h\ Srj + i* + *. (17) 

Equation (^), new to this work, is the expression we have been seeking for the 21 cm brightness temperature pertur- 
bation in the limit of zero frequency bandwidth. It is straightforward to show that it agrees with the more complete 
expression [Eq. (18)] in [ p9| in the limit T s 3> Tqmb and for low optical depth for re-absorption and Thomson 
scattering. 

Each of the terms on the right of Eq. ( p7| ) has a simple physical explanation. The first two terms are the usual density 
and redshift-space distortion terms. The third term comes from evaluating the zero-order brightness temperature 
at the perturbed time corresponding to the observed redshift. The time perturbation given in Eq. ( |l4| ) contains 
integrated Sachs- Wolfe (ISW), potential and Doppler terms. The fourth term arises from the part of the ISW term in 
Equation ( p"3| ) that is not cancelled by the velocity evolution introduced via the Euler equation. The final potential 
term comes from the conversion between increments in redshift and radial distance in the gas frame. 

The perturbation to the brightness temperature depends only on the observer velocity through Srj. Isolating these 
terms gives 

d In T 

AVl A {z,h) = h-v oA + n-v oA (l + z)——±. (18) 

We compare this dependence with the exact result for the variation in the brightness temperature under a change 
of frame: (1 + z')T£(z' , h') = (1 + z)Tf>(z, n) which follows from the invariance of the distribution function. Since 
1 + z = (1 + z')(l + n • v rcl ) to linear order, where v rcl is the relative velocity of the two observers, Taylor expanding 
the redshift dependence of Tj,(z, n) about z' we have 

d In T 

A^(z',n') w A Tt n) + n • v rel + n ■ v rcl (l + z')— [^A (19) 

which is clearly consistent with the velocity dependence in Eq. (|T§1). 

The large-scale clustering of the HI gas follows that of the discrete sources (e.g. DLAs) housing the neutral gas. We 
assume linear scale-independent bias of these sources in the comoving gauge, following the arguments in p7f . During 
matter domination, the comoving gauge coincides with the synchronous gauge, and we have 

6 n = uT+{^^-m)l (20) 

where v is the Newtonian-gauge matter/gas velocity with v = — fc _1 Vu, and S^ n is the synchronous-gauge matter 
overdensity. 

Equation (|17|) c an be compared with the related expression for the differential number counts of discrete objects 
detailed in ]26| |28|, |33| |. For simplicity, consider counting the HI atoms per solid angle and per redshift n^f(z, n). The 
brightness temperature, Eq. (g), may be rewritten as 

lb[Z ' U) ~ 32irk B El deW ' [2L) 



3 We retain the local dipole term at the observer for completeness, but, since this is observer-dependent at linear order, we only consider 
multipoles I > 2 in our later forecasts. 

4 Note that this equation is still valid in the modified-gravity models we arc interested in. 
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where detD is the determinant of the Jacobi map in the observer's frame H] which relates the (Lorentz-invariant) area 
of a bundle of light rays to the solid angle at the observer. The determinant of the Jacobi map is related to the square 
of the luminosity distance dh through reciprocity: d\ = (1 + ;z) 4 dctl?. The presence of d\ in Eq. ( pi] ) makes intuitive 
sense since we measure surface brightness (i.e. angular density of sources multiplied by flux) in 21 cm intensity mapping 
experiments rather than counting objects. At linear order the fractional perturbation to the brightness temperature 
At 6 (z, n) is then related to the HI angular number density fluctuation A(z, n) (calculated following the galaxy counts 
calculation in |§-|i§ ||) b y 

A Tb (z,n)=A(z,n)-2^f^, (22) 

where is the background luminosity distance and SdL denotes its first-order perturbation. The brightness tem- 
perature is therefore a combination of angular number density fluctuations and luminosity distance fluctuations. In 
particular, Eq. ( p2| ) tells us why the brightness temperature has no magnification term at linear order. The perturbed 
luminosity distance at fixed observed redshift indeed contains a lensing convergence term due to transverse distortion 



of the volume element at the source [35 . However, so docs the differential number count since we observe the density 
of a quantity which thus involves the perturbed volume element. These two effects exactly cancel at linear order - 
magnified sources give a greater observed flux but have reduced angular density on the sky - and lensing therefore 
conserves surface brightness. Hence, as for the CMB, gravitational lensing has no effect at first order to the 21cm 
brightness temperature: the lensed image of a smooth sky is a smooth sky 5 . 

The new terms that have been uncovered by this self-consistent linear analysis are expected to be negligible on all 
but the largest scales and highest redshifts (see Q for a thorough investigation in the context of galaxy surveys). 
The only new aspect of our application is the narrow window function in redshift-space which enhances the relative 
importance of the redshift-space distortion term. Considering fluctuations at a given small angular scale l/l, if the 
redshift window function is broad compared to the typical linear scale x( z )/l of the contributing perturbations (where 
x( z ) is comoving distance back to redshift z), we are in the 'Limber' regime and the redshift distortions oc — n- (n- Vv) 
will tend to cancel out on integrating through the window. Normalising the window function to unity to keep the 
integrated background Tb almost constant, the power from redshift-space distortions falls as (Ax) -2 , where A% is 
the width of the window function, on small scales. In the Limber limit, the power from the integrated density term 
falls more slowly - as (A%) -1 - since the fluctuation power accumulates as the number of incoherent patches within 
the window. However, since the redshift window is very narrow for 21cm mapping, only very small scales are in the 



Limber regime, thus significantly enhancing the relative power in redshift-space distortions, as seen in Fig. 6 of [26 



In Fig. pi we plot the auto-spectra (see Sec. [I A for details) of each of the terms in Eq. (17) individually, including 
those contained in Srj, for a bandwidth of 2 MHz at z = 1. Clearly, the density and redshift-space distortion terms are 
dominant on all angular scales at this redshift. Note that super-Hubble effects are generally suppressed in the 21cm 
power spectrum since the signal on large angular scales is dominated not by modes at the corresponding linear scale, 
but by smaller sub- Hubble scale modes . This is because the dimensionless power spectrum of 5 n grows rapidly 
as k on scales smaller than Hubble length (but larger than the horizon size at matter-radiation equality). The 21 cm 
signal is therefore like white noise, C; = const., on angular scales large compared to the angle subtended by the peak 
in the matter power spectrum. 

The fractional error in the power spectrum if only the density 6 and redshift-space distortion terms are retained is 
shown in Fig. || at z — 1 for various widths of redshift window. We see that the relative importance of relativistic 
effects increases as the bin size increases, consistent with the results of p6fl . This arises from the dominant white-noise 
contribution of small scale modes at a given I being suppressed by cancellations through the width of the window, 
but the contribution of large-scale modes, where the relativistic terms are relevant, not being suppressed. At the 
low redshifts considered in this work, the relativistic terms represent only sub-percent corrections to the large-scale 
angular power, where cosmic variance is large. Moreover, their effect is small compared to astrophysical modelling 
uncertainties, for example in the bias. However, we include the relativistic terms in our forecasts for consistency. 



5 Note, however, that at second order 21cm intensity mapping can be used to reconstruct the lensing power spectrum |24| . 

6 In the remainder of this work, 'density' refers to the Newtonian-gauge density of Eq. (|^). 
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FIG. 1: (Color Online). Fractional brightness temperature perturbation power spectrum at z = 1 with a 2 MHz bandwidth. 
The auto-spectra of the full signal (black, dashed) and of each individual term in Eq. ( [l7| ) are shown, generically grouped (solid 
lines, top to bottom respectively) as Newtonian-gauge density (red), redshift-space distortions (green), velocity terms (blue), 
all potential terms evaluated at the source position (cyan) and the ISW term (magenta). 



A. Angular power spectra 

After a Fourier transform and expansion into angular multipoles we have the brightness temperature perturbation 
in k and I space. For example, the first two terms in Eq. ([l7]) give 



d 3 k 

(2tt) 3 / 2 



where 



A Tbtl (k,z)Y l * m (k), 



kv 



A Tb ,i(k,z) ps S n ji(kx) + 'jgji"(kx), 



(23) 



(24) 



with x the conformal distance. We integrate over a redshift (frequency) normalized window function W{z) to give 

y>OG 

Af a (k)= dzW(z)A Tb u(Kz). (25) 
Jo 



The angular cross-spectra between redshift windows is then calculated as 



a 



WW' 

I 



4tt / dlnkVn(k)A%,(k)A%,(k) 



,w' 



(26) 



where V-nik) is the dimensionless power spectrum of the primordial curvature perturbation 1Z, and the transfer 
functions Af bl (k) = A|£ j(k)/ft(k). 
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FIG. 2: (Color Online). Fractional error in the 21cm angular power spectrum at z = 1 if only density and redshift-space 
distortions terms are retained. The errors are plotted for window functions of bandwidth (top to bottom respectively) 2 MHz 
(red), 5 MHz (blue), 10 MHz (green), and 20 MHz (black). 



We calculate the cross-spectra using a modified version of the Boltzmann code CAMB SOURCES 7 . At the low 
redshifts we consider, some care must be taken when integrating over the narrow window function, and we found it 
necessary to run the code at accuracy_boost = 2 to ensure the window was well sampled. 



III. MODIFIED GRAVITY 



Several attempts have been made to explain the phenomenon of accelerating expansion through a modification to 
standard GR ||. These modifications must predict an expansion history close to that of ACDM, but generically 
predict different linear perturbation dynamics |ij |37j . It is therefore necessary to study the clustering of matter 
in order to distinguish between competing theories^ 

Amongst the most studied examples of modified gravity are scalar-tensor theories and the higher- derivative theory 
f(R), which can be mapped on to a scalar-tensor theory via a conformal transformation of the metric and a field 
redefinition |}q] . The action in a scalar-tensor theory takes the form 

+ S m {g^,^), (27) 



where M?\ = I/t/SttG is the reduced Planck mass, V is the potential for the scalar field </>, and ipm are the matter 



M 2 



-(V0) 2 -^) 



7 http://camb.info/sources. 

8 It should be borne in mind that none of these theories explains why the vacuum energy from particle physics is cancelled to such high 
precision on cosmological scales. 
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fields which couple to the conformally rescaled metric <? M „ where 

= e-^ M ^ 9flV) (28) 

where a{<p) is an arbitrary coupling function. The metric g^ v is the Einstein-frame metric, where the action in 
Eq. (p^ ) looks like the standard Einstein-Hilbert action but with matter non-minimally coupled to the metric. This 
frame has the advantage of being mathematically 'close' to GR, but has the disadvantage that matter does not follow 
the geodesies of g^ v and the energy-momentum tensor is not covariantly conserved with respect to this metric. The 
frame picked out by g^ is the Jordan frame, where matter moves along geodesies but the gravitational action is not 
of Einstein-Hilbert form. Both frames are equivalent in the sense that observables calculated in either will be the 
same. From now on we will assume that all matter fields couple to the metric with a universal coupling a((j>). 

We describe linear perturbations in the conformal Newtonian gauge. For scalar modes, the field equations reduce 
to four independent equations involving the gravitational potentials $ and the fractional density perturbation 5, 
the velocity potential v (or equivalently the momentum density q), and the anisotropic stress EL The anisotropic 
stress is negligible in the late Universe, but we include it in numerical work for consistency. In the Jordan frame, 
energy-momentum is conserved and so the continuity and Euler equations retain their forms in GR. In Fourier space, 
for pressure- free matter, they become 

5 + kv-3<f> = 0, v + Hv-kV = 0, (29) 
where the dots denote derivatives with respect to conformal time r\. We use the Fourier convention 

*(x,?7) = J ^3^*(k,„)e** l (30) 

and, recall, vQc, rf) is defined through v = — ikv. This leaves two independent equations that may receive modifications. 
It is convenient to take these to be the Poisson equation, and the relationship between the two potentials. In many 
models, including scalar-tensor theories, the modifications on sub-Hubble scales (i.e. the quasi-static limit where 
temporal derivatives are small compared to spatial derivatives) take the form 

-fc 2 * = kivGa 2 pt{a,k)pA, (31) 
$ = 7(a,fc)*, (32) 

where A is the comoving density perturbation A = 6 + 37iv / k and p is the background matter density. Equations (|3l| ) 
and ( [32] ) may be taken to define the functions \x and 7 which are generally functions of time and scale; in GR, /i = 7 = 1. 



Note that when the quasi-static limit is not obtained, we may still define \i and 7 through Eqs(31) and (|32| ) but these 
functions will now generally depend on the initial conditions. Note that, for later convenience, we have written the 
modified Poisson equation in terms of $ whereas it involves $ in GR. 

It is straightforward to show that, despite the modifications in Eqs (|3l]) and (|32"|), the comoving curvature pertur- 
bation 1Z = — $ — Hv/k is conserved on super-Hubble scales as it is in GR. Specifically, we find 



H V aH 



*-(s)V + »>+si^(*-w."«a]'»), < 33 > 



where H = H/a is the Hubble parameter and primes denote derivatives with respect to In a. Since 1Z ~ $, the 
comoving curvature perturbation is conserved on super-Hubble scales provided /i and 7 tend to scale-free functions 
there. Constancy of the comoving curvature perturbation can be shown to hold very generally for metric theories of 
gravity in which energy- momentum is conserved [ |39| |4p[| . Differentiating "Rf = gives a second-order equation for the 
potentials which in our notation is 

$" + *'-^$'+f^-CV = ° (fc«atf). (34) 



Note that, once the relation between $ and ^ is specified, the evolution of the metric perturbations on super-Hubble 
scales is determined by the background expansion history For matter-dominated expansion, H' / H = H" / H' and 
$ and \1/ are constant on large scales if 7 is time-independent. We are, of course, still free to modify the relationship 
between ^ and A; for example, fi > 1 on large scales with constant 7 would actually suppress large-scale clustering 
in A during matter-dominated expansion, despite increasing p giving a larger effective Newton's constant. 
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On sub-horizon scales, we can rewrite the set of perturbation equations as second-order evolution equations for the 
density and for the velocity. Neglecting the $ term in the continuity equation and replacing A with 5 in the Poisson 
equation, we find 

5 + H5 -4irGa 2 pp5 = 0, (35) 
2U - v + (n + V. 2 - - 4irGa 2 ^p\ v = 0. (36) 

We see that the evolution of 5 and v on sub-horizon scales is completely determined by /i. The density and redshift- 
space distortion terms in the 21 cm brightness temperature are therefore insensitive on small scales to modifications 
in the 7 function. However they are expected to deliver stringent constraints on the time and scale dependence of p. 
In particular, since the evolution of the velocity is sensitive to both p and its time derivative p, it is strongly affected 
by sharp variation in p. 

To understand qualitatively the effect of p on the growth of perturbations on sub-Hubble scales, consider a toy-model 



with constant p and matter-dominated expansion. Equation (35) then becomes 



6 + -S -%5 = 0, (37) 

which has power-law solutions S cx a p± with p± = (— 1 ± -y/1 + 24^)/4. As expected, p > 1 enhances clustering and 5 
grows faster than in GR (where S oc a) and the gravitational potential wells of ^ deepen in time. 

Although the above solutions were derived under the simplifying assumptions of a constant p and sub-horizon scales 
during matter domination, the general effect of p > 1 on the clustering of matter is clear: a larger effective Newton's 
constant boosts the strength of gravity which enhances the degree of clustering on sub-Hubble scales, thus allowing 
the gravitational potentials to grow with time. The time-dependence of ^ and $ can source an ISW effect in the 
CMB temperature anisotropy, and alter the strength of gravitational lensing effects as a function of redshift. 

Since the models we consider attempt to explain the late-time acceleration of the universe, they only affect the 
linear dynamics at late times when the acceleration sets in 9 . The CMB is thus only sensitive to modified gravity at 
the perturbativc level via the large-scale (late-time) ISW effect, and the CMB weak lensing signal. Both of these are 
sensitive to the Weyl potential, which is equal to ($ + \E')/2 in the Newtonian gauge. The Weyl potential is related 
to the matter density by 

- k 2 {<P + *) = 4irGa 2 p{l + 7)pA = 87rGa 2 E(a, k)pA, (38) 
where we have defined the new parameter 

E = |(l+ 7 ). (39) 

We will work in the (p, E) basis since the 21 cm signal is primarily sensitive to p and the CMB is primarily sensitive 
to E, although the CMB is also sensitive to p through the enhanced growth of A it induces. 

Under fairly weak assumptions about the coupling function a(cj>) jy], an action of the form of Eq. ( p7| ) leads to the 
Bertschinger-Zukin forms of /1 and 7 in the Jordan frame given by 

^ k)= 1 + A|k* ■ (41) 

where j3\ and p% are coupling parameters, Ai and A2 are Compton wavelengths at the present epoch (a = 1) with s a 
constant describing their time evolution. For scalar-tensor theories, only two of and Ai are independent jll| . This 
form for fj, and 7 has the advantage of being physically transparent; on scales much larger than the Compton wavelength 
of the scalar field fj, and 7 obtain their GR forms, and on scales below the Compton wavelength the modifications are 



9 There are exceptions, for example f(R) models with d 2 //di? 2 < have unstable perturbations at high curvature for a standard expansion 
history ^j. 
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proportional to the coupling parameters /3{. In addition, for s > 0, the comoving Compton wavelength grows with 
time, such that the influence of the scalar field perturbation can be confined to late times if desired. 

The disadvantage of using the parametrizations in Eqs (40) and (|4l]) is that their domain of applicability is limited 
to scalar-tensor theories, or theories conformally equivalent. A more general, mo del- independent approach is desirable. 
The works of Baker et al. Q provide just such a mo del- independent formalism, but the large number of free 
functions present in their formalism is undesirable for our simple application here. 

We argue that the most interesting theory that may be probed by upcoming large-scale structure surveys is the null 
theory, i.e. we seek to test whether GR is correct on cosmological scales. To establish the veracity of this statement 
is a far more interesting challenge than choosing between the many alternative theories on offer, few of which have a 
compelling observational justification for pursuit. To this end, we will perform a principal component analysis (PCA) 
of the functions /x and E. Since GR predicts that these functions are equal to unity at all times and (linear) scales, any 
evidence to the contrary would be particularly interesting. A study of the well-constrained functional forms provides 
a model- independent way of forecasting future constraints 10 . Wc outline the details of our PCA procedure in Sec. |y|. 

We note in closing that the parametrization of Eqs (|o|) and ( pil] ) is particularly ill-suited to inference when the 
maximum-likelihood point is close to GR. In this case, one of the parameters (f3i,\i) is unconstrained. To see this, 
consider fixing the coupling to zero. Then, it does not matter what value we pick for the Compton wavelength or its 
time dependence since we always recover GR. Similarly, if the Compton wavelength is unobservably small, we expect 
no constraint on the coupling or time dependence, since the effects of the scalar field are never felt at observable scales. 
The posterior is then very non-Gaussian in these parameters around the GR model and an error analysis based on 
the Fisher matrix will be wrong. A better approach would be to use some related parametrization which uses specific 
values of /i and 7 at predefined scales and times, which should have a much more Gaussian posterior. One could then 
map the Fisher constraints onto non-Gaussian distributions for the Ai) parameters. 



IV. INTENSITY MAPPING SURVEY ASSUMPTIONS 



In this section we provide details of the window functions and survey parameters of our assumed 21 cm experiment, 
and the astrophysical bias. Throughout, we use a Gaussian frequency-space window function normalized to unity 
with a fixed bandwidth of 2 MHz. We consider the frequency range 400-800 MHz corresponding to z — 0.7-2.5. The 
corresponding radial resolution is 12 Mpc at z — 0.7 and 34 Mpc at z = 2.5. Smaller bandwidths are possible and may 
improve our constraints somewhat since at the upper end of the redshift range we include linear modes short enough 
to be suppressed by integration through the window (see Sec. ^). However, using a narrower window requires finer 
sampling and increases the run-time of the code significantly. 

We take a scale-independent bias of b = 2, which is consistent with the bias of DLAs recently derived from cross- 
correlation with the Lya forest ^5|. We further take the bias to be independent of redshift. In our forecasts, we later 
treat the bias as a free parameter and marginalise over it, arguing that its exact numerical value is not important. 

For our experimental setup, we consider a small-scale interferometer consisting of an array of dipole antennae. We 
take the dimensionful thermal noise angular power spectrum to be white with J46| 



1(1 + = T s 2 ys (2^) 2 1(1 + 1) 

2-7T AlA f 2 I 2 

ni\ """-o^covcr 'max 



(42) 



and the noise uncorrelated between frequency bins. Here, T sys is the system temperature of the antennae, Av is 
the bandwidth, t a is the total observing time, / COV cr is the ratio of the total collecting area to the geometric area of 
the array, and / max is the maximum multipole given by 2?tD / A where A is the observing wavelength and D is the 
diameter of the array. The quantity /cover is sometimes called the array filling factor in the literature, and is closely 
related to the aperture efficiency of a single-dish experiment. We set / CO vor = 1 for simplicity, since its precise value 
depends on the geometry of the experiment. The system temperature is given by T sys = T ant + T s k y where T ant is 
the antenna temperature and T s k y is the angle-averaged sky temperature due to foreground contamination. At the 
low redshifts considered here, the system temperature is dominated by the antenna temperature, which we take to be 
40 K, a reasonable estimate given current technology [[H]. For the sky temperature we take 

-2.6 



t ^- 5 -°{tiomhJ k (43) 



However, since the number of PCA nodes is finite, we are still vulnerable to theories which may mimic GR at the nodes, but the use of 
a fine enough pixclization should mitigate this risk considerably. 
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TABLE I: Experimental parameters adopted for our forecasts; see text for definitions. 



Parameter Value 



^max 


835 MHz 


^min 


406 MHz 


Av 


2 MHz 


D 


100 m 


/cover 


1.0 


to 


lyr 


Tant 


40 K 


/sky 


0.5 



appropriate for synchrotron radiation. We note that Eq. fl42| ) has been derived under the assumption that the array 
uniformly covers a region of Fourier space of area 7r^ ax as the Earth rotates; for refinements accounting for non- 
uniformity in the array see e.g. J46| . 

We assume a total integration time of one year, and assume the array has a diameter of D = 100 m, corresponding 
to something roughly the size of CHIME [18 . In addition, we assume the survey covers a fraction of the sky / s k y = 0.5. 



Our experimental parameters are summarized in Table Q. 

Since the sources of the 21 cm signal are discrete, the measured auto-spectra have shot noise contributions as well 
as the clustering signal. For a redshift window with an angular density of sources N(z), we include a dimensionless 
shot noise power given by Cf hot — 1/N(z). We assume a comoving number density of sources of 0.03 h 3 Mpc~ 3 
following p4| . 

We further assume that no modes are used with multipole greater than some z-dependent cut-off to avoid 

issues with modelling non-linear scales (see Sec. |v|). In Fig. || we plot the dimensionless 21 cm auto-spectra and noise 
power spectra at z — 2.5 and z = 1.0. In each case, the greatest multipole plotted is the cut-off l^™(z). 

We further split the frequency range into iV W i n 2 MHz frequency windows. The choice of A^ W j n and the spacing of the 
windows is a subtle issue dependent to some extent on what the experiment is trying to measure. Ideally, we should 
include all (y max - ^ m i n )/Af windows to retain all information. However, the number of power spectra increases with 
the number of windows as A^ in and calculating all of these is computationally prohibitive for the forecasts in this 
paper. This problem would be exacerbated in a full parameter analysis where the spectra would have to be computed 
O(10 5 ) times across the parameter space. Of course, the observed signal will be highly correlated for sufficiently small 
radial separations. Fine sampling on radial scales smaller than this correlation length will not improve parameter 
constraints significantly in models with power spectra that vary slowly in time if the measurements in individual 
windows are sample-variance limited. Note that for 21 cm cosmology, on large angular scales the radial correlation 
length is rather smaller than x( z )/l since shorter modes contribute significantly there. This means that even if one 
is interested in a parameter that only affects large scales, it is advantageous to space windows more closely than this 
to beat down the cosmic variance of the smaller-scale modes. An alternative approach is to transform to a different 
radial basis to remove the correlations. For example, if we were simply observing projections of a time-independent 
scalar field, a spherical-Bcsscl transform in the radial direction (adopting a fiducial cosmology to convert redshift to 
distance) would yield approximately uncorrelated spherical multipoles. Such an approach is advocated in for 
large-scale cosmic shear surveys. Note that galaxy redshift surveys circumvent the issue on small scales by breaking 
the survey volume up into sub-volumes and estimating the (anisotropic) 3D power spectrum independently in each 
volume. 

A full exploration of these issues is beyond the scope of this paper. Instead, for most of our forecasts we simply 
take iV w i n = 20 windows equally spaced across the observable frequency range. As we discuss in Sec. VA and VB, 
we expect our qualitative results to be stable to increases in N w { n and to changes in their location. However, in our 
PCA of general models we do expect that the constraints on the amplitudes of individual modes could be improved 
though not their general form or the number of well-determined modes. 



V. FORECASTING METHODOLOGY AND RESULTS 



In this section we present the details of our statistical analysis and forecasted constraints on the parameters of 
modified gravity. 
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FIG. 3: (Color Online). Dimensionless 21cm brightness temperature auto-spectra (green, upper solid), shot noise power (blue, 
dashed), thermal noise power (red, dot-dashed) and total noise power (black, lower solid) at z = 2.5 (top) and z = 1.0 (bottom), 
for the survey parameters in Table Q. 



We construct a Fisher matrix for the parameters 6i (seven cosmological + N modified gravity), for an upcoming 
21 cm intensity mapping experiment with noise spectrum given by Eq. ( fl2| ) and survey parameters from Table ||. In 
addition we include CMB (unlensed) temperature, _B-mode polarization and a reconstruction of the lensing deflection 
field d from the CMB. We adopt Planck-like noise levels and and angular resolutions taken from fig] , and retain 
multipoles up to Z™ B = 2000. Statistical noise on the lensing deflection reconstruction is computed using the 
optimal quadratic estimator of 
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The Fisher matrix for the parameters 0, is defined as the ensemble average of the Hessian matrix of the log-likelihood, 
and, assuming Gaussian fields with covariance C, is given by 



Fij = ^Tr 



2 



c _i3C c -i£>C 



'max anxx' BC YY ' 

[Cov(XX',YY% 1 ^—, (44) 



1=1 XX' YV 1 



where Cov(XX' , YY') is the covariance of the power spectra estimators, including noise, and XX' and YY' stand for 
combinations of the observed fields (TT, TE, Ed etc.). For the specific form of Cov(XX' , YY'), see for example [p0| . 
Note that we include all the cross-correlations between the CMB and 21cm fields, which potentially carry a lot of 
information about late-time phenomena (see ]5lJ for a discussion in the context of galaxy surveys). The non-zero 
cross-correlation between the low redshift 21 cm signal and the (unlensed) CMB temperature anisotropy is due to the 
late-time ISW effect in the CMB 11 , whilst the 21 cm-polarization correlation arises since the last-scattering surface 
of an electron at reionization has some overlap with the radial distances of relevance for the 21cm signal [|52|. The 
latter correlation is small, but grows with redshift and we include it for consistency. The inverse of the Fisher matrix 
gives the covariance matrix between the parameters and its diagonal elements give the la marginalised errors on 
parameters. 

We vary seven cosmological parameters, (fi&/i 2 , fl c h 2 , h, r, A s ,n s , Q^h 2 ), and assume three degenerate neutrinos at 
the minimum mass of the normal hierarchy (Q^h 2 m 0.0006). We fix the curvature to be zero, and the expansion 
history to be that of ACDM (plus massive neutrinos) , as favoured by a variety of cosmological probes ]53| . 

To avoid the issues of modelling non-linear structure formation in modified gravity, for the 21cm signal we retain 
only multipoles up to ^ax"- Whilst the comoving non- linear scale at the present epoch in GR is roughly at /cnl ~ 
0.1 /iMpc -1 , in modified gravity it can be rather larger. For example, TV-body simulations of f(R) models in H] show 
that linear physics is only recovered for k < 0.06 ftMpc -1 at z = 0. To account for these effects, we set l 2 ^^ = 150 for 
z < 1.5, corresponding to kfl L cm 0.05-0.08 /iMpc" 1 , which roughly matches the non-linear scales found in (54) 12 . 
For z > 1.5, we set ^ a " n = 500, corresponding to fc 2 ^ " 1 ~ 0.12-0.15 ft-Mpc -1 , since the comoving non-linear scale is 
still expected to grow with time in modified gravity models. We ignore non-linearities in the CMB lensing deflection 
power spectrum since, at Planck noise levels, the signal-to-noise on the lensing reconstruction is expected to be very 
low on non-linear scales. 

To include the effects of modified gravity in CAMB SOURCES, we use the MGCAMB module adapted for 
compatibility with CAMB SOURCES. We also update the sub-horizon radiation approximations to those of the full 
CAMB release. The May 2011 release of CAMB SOURCES uses the old sub-horizon approximations of but this 
formalism implicitly assumes GR. We update this part of the code to use the newer formalism of [p7|, which can 
be easily generalised to include modified gravity. We also incorporate the effects of anisotropic stress from massive 
neutrinos consistently in MGCAMB, although the effects are negligible at late times. 



A. Constraints on f(R) gravity 

We start by considering the constraints that 21cm intensity mapping can place on f(R) theories. In this section, 
we fix the bias to a constant value of b — 2.0, consistent with the constraints from P5| . Note that to satisfy local 
tests of gravity, the f{R) Compton wavelength A at present cosmological density is restricted to A < Mpc |58|, p9[ . 
This limits departures from GR in f(R) theories consistent with Solar System tests to non-linear scales. However, 
here we are not interested in constraining f(R) theories per se, but rather in understanding which kind of constraints 
21cm observations could place on an (effective) theory that behaves like f(R) gravity at cosmological scales, without 
necessarily satisfying local tests. 

We consider the i?o-parametrization of f(R) gravity [p0[, which provides a good approximation on quasi-static scales 



5l|. This is closely related to the scalar-tensor type parametrization discussed in Sec. HI, since f(R) is conformally 



11 Note that the 21cm signal also contains an ISW effect [see Eq. (|14|)] that is perfectly correlated with the ISW contribution to the CMB 
for z > 2. However the amplitude of this correlation is subdominant with respect to the 21 cm density-ISW cross-correlation. 

12 Note that we fix the maximum I rather than the maximum k. 
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equivalent to a scalar-tensor theory. The specific forms of /j, and 7 we consider are 

1 l+4A 2 fc 2 a 4 /3 

M °' ' l-1.4x 10- 8 |A/Mpc| 2 a 3 1 + A 2 fc 2 a 4 ' ( ' 

1 + 2A 2 fc 2 a 4 /3 

7(a ' fc) = 1 + 4AW/3 ' (46) 

where the parameter Bq = 2if 2 A 2 is the square of the Compton wavelength of the effective scalar degree of freedom 
fn = df/dR (the scalaron) in units of the current Hubble length. The prefactor in Eq. ([I||) accounts for the time- 
dependence of Newton's constant in the Jordan-frame background equations, but is practically unity for the models 
we consider. It is then easy to show that £ « 1 in this theory, so the relationship of the Weyl potential to the density 
is the same as in GR (although the density is of course different). 

Our parameter space thus consists of the seven cosmological parameters, plus Bq. We take GR as a fiducial model, 
i.e. Bq = and adopt 20 window functions equally spaced in frequency. Our la errors are Bq are 1.9 x 10~ 4 using 
only the CMB information, improving to 3.7 x 10 -5 with the addition of 21 cm. 

We have marginalised over all other parameters, but there is only mild degradation in the constraint as a result 
of this. The main covariance of Bq is with Vt u h? (correlation coefficient of 0.4). Increasing Bq pushes the scalaron 
Compton wavelength to larger scales, such that a greater range of (small) linear scales experience enhanced growth 
and for longer. This boosts the 21 cm signal. The main effect of light massive neutrinos on the 21 cm power spectrum 
is a suppression of power on scales below the horizon size at the redshift where the neutrinos become non-relativistic. 
The suppression is scale-dependent for scales between the free-streaming scale at the transition and the free-streaming 
scale at the 21cm redshift, with smaller scales exceeding the free-streaming length later and hence suppressing the 
growth of structure for longer |30|. On scales smaller than the free-streaming length at the 21cm redshift the 
suppression is scale-free, and for scales larger than the free-streaming length at the non-relativistic transition there is 
no suppression at all. This small-scale suppression can partially cancel the enhancement from increasing Bq, giving a 
positive correlation. 

We see that the addition of information from 21 cm intensity mapping improves the constraint on Bq by about 
a factor of five over the CMB-alone case. Although this is not competitive with local constraints or forecasted 



constraints for the Dark Energy Survey or Large Synoptic Survey Telescope 55 , it demonstrates the potential 
usefulness of intensity mapping in constraining f(R) theories. It is interesting to note that our constraints are 
comparable with those in p4| , who considered only the baryon acoustic oscillation signature extracted from 21cm 
maps, lens reconstruction from 21 cm itself and Planck-like priors, when only linear scales are included. To map from 
their parametrization to ours, note that Bq w — 2{n + l)/^ /(4 — 3f2 m ) where the power-law n appears in the Hu 
& Sawicki model of f(R) pQ] that they employ. Current cosmological constraints from |62) limit Bq < 1.1 x 10~ 3 
at 95% confidence, although this constraint is driven mainly by cluster abundance data where non-linear chameleon 
effects not accounted for in |p^| could be important. Excluding the cluster abundance data, |62[ quote an upper limit 
of Bq < 0.42 at 95% confidence. Our 95% upper limit is 7.4 x 10~ 5 , which represents an improvement on this current 
constraint by roughly four orders of magnitude. 

What limits the power of intensity mapping in constraining Bq! The main effect of Bq on the 21 cm power spectrum 
is an enhancement of power on scales below the Compton wavelength, with the enhancement increasing with time. 
Thus, setting aside the influence of the CMB for now, we may conjecture that the upper limit set on Bq from 21cm 
data derives from the shortest observable scale at the lowest observable redshift. In our case, this is the non-linear 
scale 13 , which at z = 0.7 is roughly 0.08 /iMpc~ . The value of Bq which sets the comoving Compton wavelength 
equal to this scale at z = 0.7 is around 3 x 10 -4 . This is only a rough upper limit, since /j, and 7 are not step functions, 
and even scales above the Compton wavelength experience some modified growth. We can therefore anticipate that 
as our understanding of modified-gravity scenarios in the mildly non-linear regime of structure formation improves, 
we will be able to exploit 21cm (and other cosmological observables) on smaller scales thus enhancing our ability to 
constrain scalar-tensor theories and f(R). 

With some experimentation, we find that our results are stable to changes in the number of windows around 
iV w i n = 20, equally spaced across the observable frequency range. For the Bq constraints, most of the signal comes 
from the lowest observed redshifts, so adding more bins does not impact the constraints significantly. Our results are 
also stable to moving all 20 bins to low redshift, with a radial separation corresponding to the distance at which the 
cross-correlation with the z = 0.7 window at I = 150 first goes to zero. 



13 Note that our choice for the non-linear scale is more conservative than in . This may partly explain their stronger constraint on Bq. 
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To understand which part of the 21cm signal is most powerful in constraining Bq, we repeat the analysis with the 
density term switched off, in both cosmic variance and the signal. This is equivalent to what could be achieved if the 
redshift-space distortion signal could be extracted coherently in the 21 cm maps. While not possible for 21 cm intensity 
mapping, we note that such extraction is possible if the clustering of differently biased populations is combined [|63| . 
We find that the marginalised error on Bq decreases by roughly 60%, suggesting that redshift-space distortions are 
significantly more sensitive to Bq than the density, and that the information we can extract from redshift-space 
distortions is limited by the cosmic variance of the dominant density term. 

On the CMB side, one question we can answer quite easily is whether it is CMB lensing or the ISW which is more 
powerful in constraining Bq in our combination of observables. To investigate this, we run the above analysis again but 
with the late ISW term set to zero. We find that the marginalised error on B increases by roughly 0.5%, indicating 
that the information about modified gravity brought by the late ISW effect is sub-dominant to that brought by CMB 
lensing. This is likely due to large cosmic variance present in the CMB at the large angular scales where the late-time 
ISW effect is significant. 

B. Principal component analysis 

Whilst specific parametrizations of the modified gravity sector such as Eqs ( f!o| ) and ( pl|) have the advantage of 
being physically transparent and easily related to the underlying theory, they are too restrictive in scope and fail 
to describe the general space of theories. To test GR on cosmological scales in a more theory-agnostic manner, a 
model-independent approach is required whereby one considers rather general functional forms for /i and 7. Such an 
approach is provided by the machinery of PCA. The basic idea is to establish what functional forms of /z(a, k) and 
7(0, k) can be independently constrained well by the data. For applications of PCA in testing modified gravity, see 

e.g. §| g. 

The first step in a PCA is the pixelization of the free functions, ideally at fine enough resolution that the well- 
determined modes coincide with what would be obtained from a continuum analysis. In practice, for this forecasting 
analysis, computational resources limit us to consider 4x4 grids for both /i and 7, equally spaced in both comoving 
wavenumber and conformal time. Our range of k is 0.00 < k < 0.153 ftMpc -1 , and our redshift range is < z < 3. 
The upper limit on z is arbitrary, but we only wish to consider models where modified gravity mimics dark energy, 
which motivates a choice of upper limit that matches the onset of acceleration. We enforce GR for z > 3. The upper 
limit on k corresponds to the smallest observable scale in the 21 cm signal. For k > 0.153 /iMpc -1 we set our functions 
equal to their values at k = 0.153 /iMpc" 1 . The choice of grid resolution imposes a prior on how rapidly we allow the 
modified-gravity functions to vary in scale and time. 

We thus parametrize departures of fi and 7 from their GR form (unity) with 16 nodal values each. We use a 
bicubic spline to interpolate between these nodes, setting the derivatives to zero at the edges of the grid. We also 
experimented with bins rather than nodes, in the manner of p5| |64| ], but found that the steep variation of /1 and 7 
at the bin edges caused a loss in numerical accuracy. We transform our Fisher matrix into the (/i, S) basis, since 
conditional information from £ comes almost completely from the CMB alone, whereas information about fi comes 
from both 21cm and CMB fields. 



1. Fixed bias, b — 2 

We first report results of the PCA assuming the bias b = 2 is known perfectly. In this case, our modified-gravity 
sector consists of 32 parameters and our total parameter space has a dimension of 39. We construct the Fisher 
matrix for all these parameters, and marginalise over the seven cosmological parameters by removing the appropriate 
rows and columns from the full covariance matrix. We then proceed to diagonalise this sub-matrix, which gives 
us 32 eigenvectors and associated eigenvalues. Each eigenvector is normalized to unity, and gives an independently 
constrained mode of and 7 for our observables. The la marginalised errors in the determination of the amplitude 
of each of these normalised modes is given by the square-root of the corresponding eigenvalue. We plot these errors 
in ascending order in Fig. ^. 

Since the eigenmodes of the PCA refer to departures from GR, and so have fiducial value zero, there is no unam- 
biguous definition of a 'well-constrained' set of principal components. However, the observations only provide useful 
information on those modes for which a la fluctuation in amplitude corresponds to changes in /1 and £ which are not 
large compared to unity. Where this is not the case, physicality priors, such as /x > 0, would constrain the amplitude 
of the mode much more strongly than the data. Moreover, it is clear that those modes for which a la fluctuation in 
amplitude produce large changes in /x and £ do not give a precise test of GR. Since the eigenvectors are normalised 
to unity in a 32-dimcnsional space, the typical eigenvector will have at least one component that is 0(1) and so we 
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FIG. 4: (Color Online). The la errors on the amplitude of each mode of the PCA (i.e. the square root of each eigenvalue) from 
the CMB alone (blue, upper points) and CMB and 21cm (green, lower points). 



regard eigenvalues less than one as corresponding to well-constrained modes . On this measure, our combination of 
observables can constrain well around 22 modes of modifications to gravity. The discernible upturn in the spectrum 
of eigenvalues in Fig. |I] for the 21 cm-plus-CMB combination suggests that we have reached the saturation point of 
the PCA, in that the data would not be informative about the additional modes that would result from increasing 
the number of grid points. The upturn is at an O(l) eigenvalue which makes our enumeration of the well-determined 
modes reasonably robust to changes in the threshold eigenvalue. We see further from Fig. || that adding 21 cm in- 
formation to the CMB observables increases the number of well-constrained modes considerably, as well as reducing 
the overall errors on these modes. This is due to the information on scale and time-dependence of /x brought by the 
21cm signal, as well as degeneracy breaking effects in the combination of CMB and 21cm, which we discuss below. 

For each eigenvector, we can use its components in the (p, E) basis to construct an 'eigensurface' in the (fc, 77) 
plane for each function, using bicubic splines to interpolate between the nodal values; see Fig. [| for the first eight 
well-constrained modes and the two poorest-constrained modes. Such plots give a visual impression of the 'sweet 
spots' of fj, and E probed by our particular observables. The poorest-constrained modes are localised to late times 
and small scales where our observables have little sensitivity. Note that we do not plot marginal constraints on either 
one of /i or E alone. We would expect /i and E to depend on functions or parameters more fundamental to the 
underlying theory, as is the case in scalar-tensor theories. It would thus be misleading to reduce the parameter space 
by marginalising over either of /i or E (as well as the cosmological parameters). 

We see from Fig. |] that the first two eigenmodes are dominated by E, both showing a single broad oscillation in 
the temporal direction (which is over-sampled by the grid) and confined to large scales. Although the 21 cm signal 
carries no information about E on sub-horizon scales, the information it brings on (i is complementary to that of 
the CMB, and breaks some of the degeneracies between /i and E present in the CMB. The source of this degeneracy 
can be understood roughly as follows: Increasing fi enhances the density perturbation on sub-horizon scales, which 
then increases the depth of the Weyl potential via the Poisson equation. This can be compensated by decreasing E, 
although the cancellation is not exact since the response of the density to fi is non-local in time. Since the CMB 
probes modified gravity only through the Weyl potential, there is an approximate degeneracy. Thus, the shapes of 
the E eigensurfaces are not the same as those that use only the CMB as an observable. 

The primary sensitivity is thus to temporal oscillations in E, and variations in scale in /i. In Figs ^| and ^ we plot 
the fractional difference of the resulting power spectra from GR for the first and third eigenmodes with amplitude of 
lcr. By construction, for both modes plotted the resulting change in the marginal log-likelihood is —1/2. Note that 



Note that this conclusion would change if we used many more grid-points. The best determined modes are reasonably well sampled by 
our grid (see below) and so we expect these would not change significantly with greater grid resolution. However, the components would 
reduce in amplitude, to preserve the normalisation, and the eigenvalues increase to preserve the form of /i and £ for an amplitude of 
lcr. The threshold eimmvalue for well-determined modes would thus increase with grid resolution. This makes direct comparisons with 
other work, such as fcEJ which considered future cosmic shear and galaxy surveys, difficult since the dimensionality of the grids differ. 
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FIG. 5: (Color Online). Left: The fj,(r),k) eigensurfaces of the eight most well-constrained and two poorest-constrained 
eigenmodes for 21 cm-plus-CMB observables in the (rj, k) plane. The eigenvectors have been normalised to unity, and the 
surfaces are bicubic splines in between the nodes of the PCA. Right: The same, but for E(ry, fc). Note that the overall 
normalisation is fixed to unity across \x and £ but the sign of each eigenvector is arbitrary. 



all differences shown are well within cosmic variance and experimental noise, but the broad-band features allow us to 
average over many modes and get a constraint. 

We see from Figure ^ that temporal oscillations in E source a large ISW effect in the CMB, since the Weyl potential 
has a significant time derivative. There is also a lensing contribution, since the oscillations are broad enough that the 
CMB lensing kernel changes significantly over their period. The information on scale-dependence brought by lensing 
is of course limited by the Planck reconstruction noise. 

To understand the shape of the lensing plot in Figure ^, consider the power spectrum of the lensing deflection d in 
the Limber limit, which should be accurate on small angular scales ]65| : 

Cf«— / d x V { ^ )/2 (l/ X ;Vo-x) [X * 2 X> , (47) 
' Jo X*X 

where "P(*+$)/2 is the dimensionless power spectrum of the Weyl potential, and x* is the conformal distance to 
recombination. The integral is taken along hyperbolae of constant I = fc(?7o — rf) in the (rj, k) plane. We focus on the 
most well-constrained eigenmode, since this is mostly E with a broad oscillation in the time direction, which simplifies 
the discussion. In Fig. || we plot the E part of the eigensurface for this mode, with curves of constant I overlaid. 
Deviations from GR occur in the lensing signal because as we sum up the contributions along these curves in evaluating 
the integral in Eq. (fl7|), we pass through regions where E does not obtain its GR value. The relative importance of 
these regions in the above integral depends on the magnitude of the deviation in E, the radial distance through the 
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FIG. 6: (Color Online). Fractional difference from GR in the CMB temperature power spectrum (top) and CMB lensing deflec- 
tion power spectrum (bottom) for la changes in the first (green, solid) and third (blue, dashed) best-determined eigenmodes 
of the CMB-plus-21 cm combination. 



region, and the product of the time-dependent geometric lensing kernel in Eq. (47) and the scale- and time-dependent 
value of 7 : '(^+$)/2(fc; tj) in GR. The geometric kernel favours late times and the power spectrum favours large scales 
and early times. As an example, consider I = 300. The corresponding hyperbola in Fig. ^ passes through a region 
where E < 1 , and a region at lower \ where E > 1 . The geometric lensing kernel up- weights the patch at lower \ 
thus raising Cf d above its GR value, consistent with Fig. g. As we decrease I, we start to pick up the large region 
with E < 1 at late times at the bottom of Fig. |[ as well as the large region in the top left corner at large scales with 
E < 1. This reduces Cf d below its GR value giving the negative feature in Fig. ^ around I ~ 100. At even lower I, we 
start to accumulate relatively more of the 'positive' part of the E oscillation, where E > 1. This drives the upturn in 
Fig. at low I, although the Limber approximation become increasingly poor on these large angular scales. 

In contrast, most of the well-constrained [i modes exhibit oscillations in scale, with the third most well-constrained 
mode having a single broad feature in the (k, ij) plane. As has been mentioned, CMB lensing brings limited scale 
information due to reconstruction noise, and the late-time ISW effect only affects the large-scale CMB where cosmic 
variance is large. However, the 21cm signal can give excellent scale information, as well as time-dependence from 
the tomography. Figure [?] shows the fractional difference from GR of this signal at low and high redshifts. The 
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FIG. 7: (Color Online). As Fig. g but for the 21 cm power spectrum at z — 2.5 (top) and z — 0.7 (bottom). 

well-constrained [i modes thus come mainly from the 21 cm signal, which can be confirmed by running the PCA using 
only 21 cm as an observable. In that case, we find the same fc-space oscillations, although the overall constraints on 
modified gravity are poor due to degeneracies with the standard parameters which are not well constrained in the 
absence of CMB information. 

The 21 cm power spectrum for the best-constrained eigenmode has an excess over GR on large angular scales at 
high redshift, with a magnitude larger than for the third-best constrained; see Fig. This may seem surprising at 
first, since the ^(77, k) eigensurface for the best-constrained mode is low amplitude compared to the third mode and 
negative. However, the effects of £ are non-negligible on horizon-scale modes in the 21 cm signal, and these purely 
relativistic effects drive the large deviation from GR at low I for this mode. At I = 2, roughly 75% of this deviation 
can be attributed to the new terms in Eq. (fl7] ) and their cross-correlation with the standard terms, the dominant 
contribution coming from the potential terms. The remaining 25% comes from non-negligible relativistic terms in the 
dynamical equations for the density and velocity. 

To understand which term in the 21 cm signal contributes most to the [i constraints, we repeat the PCA without 
the density contribution. The number of principal components with eigenvalues below unity then decreases to 10. 
However, if we perform the PCA with bins (as in |55|), this number rises to 26, above that obtained with nodes. 
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FIG. 8: (Color Online). The £(??, k) eigensurface of the most well-constrained eigenmode for 21 cm-plus-CMB observables. The 
solid black curves correspond to fixed I = k(r]o — 77) at values of 300 (top), 100 (middle), and 20 (bottom). 

This reflects the fact that the redshift-space distortion term is significantly more sensitive to /1, and particularly 
to its temporal variation /i, than the density term. Hence if there were a way to separate density from redshift- 
space distortion in the 21cm signal, this could significantly improve sensitivity to modifications of gravity, although 
uncertainties in the bias would propagate through to the variance of any density-free estimator. 

To summarise, in models where the bias is constant and fixed, we are able to extract roughly 22-29 well-constrained 
modes from the modified-gravity sector. Our observables are most sensitive to broad radial oscillations in S, and broad 
wavenumber oscillations in /1. Addition of 21cm information dramatically improves the number of well-constrained 
modes, increasing this number from 10 to 22 using the a < 1 threshold reference. The results in this section assume 
N w - m = 20 windows equally spaced in frequency. We speculate that increasing the number of bins could improve 
constraints on the individual mode amplitudes (i.e. reduce their eigenvalues) by as much as a factor of ylO. However, 
we would not expect significant changes in the form of the eigenmodes since the the 20 redshift windows already over- 
sample the fastest temporal variation supported by our grid. Given the sharp upturn in the spectrum of eigenvalues 
in Fig. |^, the number of well-constrained modes should also not change significantly with an increase in iV W i n . As a 
concrete (and computationally tractable) example, running the PCA with 25 equally-spaced windows does not have 
a significant effect on the eigenvalues of the well-constrained modes, but does improve the constraints on some of the 
poorly-constrained modes (although the eigenvalues of these modes still remain well above unity). 

Finally, the PCA provides us with a useful consistency check. We can map /j, and 7 in the Bq parametrization of 
f(R) gravity, Eq. ( fl6| ) , onto the 32 nodal values and hence constrain Bq by transforming the Fisher matrix. Doing 
this gives a la marginalised error on Bq of 3.5 x 10^ 5 , which is within 5% of the value found by using Bq as a 
parameter directly. 

2. Free bias parameter 

Treating the bias as constant and perfectly known is unrealistic. Instead, we should model it with some functional 
form and marginalise over it within some prior range informed by theory and other observations. In this subsection, 
we treat the bias as constant - a reasonable assumption on large scales - but unknown, and marginalise over it in the 
forecast about a fiducial value 6 = 2. We thus expand the dimension of our total PCA parameter space to 40. 

We find that the combination of 21cm and CMB can constrain the bias with a marginalised la error of 0.05, i.e. 
to a precision of 2.5%. As a reference, the cross-correlation of DLAs with the Lyman-a forest from |^5| measures 
b = (2.17 ± 0.20)f3p 22 , where /3p is the Lyman-a forest redshift-space distortion parameter which is expected to 
be above unity. Taking ftp = 1, this represents a 9% measurement of b, but this ignores uncertainty in f3p from 
uncertainty in the cosmology and from modifications to gravity. 

The inclusion of a constant bias as a free parameter has negligible effect on the modified-gravity constraints. The 
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TABLE II: Fractional increase in la marginalised errors due to marginalising over a constant bias factor. 



Parameter er(free &)/<r(fixed b) 

Q b h 2 1.080 

Q c h 2 1.584 

h 1.148 

r 1.000 

As 1.029 

n s 1.137 

Q. v h 2 2.218 



number of well-constrained eigenmodes in the PCA does not change. The main effect of marginalising over b is 
a degradation in the error on Vt v h 2 \ see Table |0], where we list the ratios of marginalised errors on cosmological 
parameters with and without marginalising over bias. The scale dependent suppression of power induced by massive 
neutrinos discussed in Sec. VA can be (very approximately) mimicked by changing the bias. Even though we have 
forced b to be scale-independent, increasing it is not equivalent to a scale-independent amplitude change, since the 
rcdshift-spacc distortion part of the brightness temperature is unaffected by bias and is suppressed relative to the 
density term on small scales due to the width of the window function. 



VI. CONCLUSIONS 



We have shown that a typical near-term low-budget 21 cm intensity mapping experiment will be able to improve 
constraints on modified gravity theories considerably, when combined with CMB observations from Planck. In par- 
ticular, this observable will improve constraints on the Bq parameter of f{R) gravity by a factor of 15 over current 
cosmological constraints ]6^| . 

By performing a principal component analysis on the free functions of the modified-gravity sector /i and £ (or 
alternatively 7) in models with a ACDM expansion history, we estimate that the 21cm experiment, combined with 
Planck, will provide useful information on around 20 modes. This improves on the 10 modes forecast for Planck 
alone, showing that 21 cm intensity mapping has significant additional constraining power in general modifications of 
gravity. Although our results do not seem to be competitive with those forecast in |35f for future galaxy clustering 
and cosmic shear surveys, our analysis is more conservative since we adopt a more realistic non-linear cut-off scale, 
motivated by recent A^-body simulations in f(R) models The 21cm experiment is also much smaller scale 

and cheaper than galaxy imaging and spectroscopic surveys . More generally, the astrophysical and instrumental 
systematic effects are quite different and it is the combination of all such cosmological probes of modified gravity, 
including their cross-correlations, that will allow for accurate and precise constraints. 

By studying the shape of the well-constrained eigenvectors of \i and S in the (77, k) plane, we can identify the 
'sweet-spots' of the CMB-plus-21 cm data combination. We find that the combination of observables is most sensitive 
to broad temporal variations in S, and broad scale variations in /1. The former is probed primarily by the CMB, 
through the late-time ISW effect and gravitational lensing, although 21cm has an important role here in breaking 
degeneracies with \i that exist in a CMB-only analysis. In the models we consider, with only late-time modifications 
to GR, the ISW effect only probes large-scale modifications. Similarly, the scale information available from CMB 
lensing with Planck will be limited by the high statistical noise in the lensing reconstruction, although this will 
improve considerably with ongoing high sensitivity polarization measurements. In contrast, the 21 cm survey provides 
excellent scale information which allows scale-dependent modifications in fi to be well constrained. 

We have also considered the constraints that may be placed on a one-parameter model of modified gravity, specifi- 
cally the Bq parametrization often used to study f(R) models. We find that our combination of observables should 
permit a 95% upper limit of 7 x 10~ 5 on Bq after marginalising over cosmological parameters in flat models with 
ACDM expansion histories. Although this is not competitive with local constraints or forecasts from future weak 
lensing and galaxy surveys, it does suggest that useful constraints from cosmology can be obtained with the kind of 
low-budget experiments considered in this work. 

Although we have considered a specific experimental set-up in this work, we have tried to keep the survey details 
as general as possible. In principle all of our survey parameters could be different, but we would not expect our 
conclusions to change significantly provided that the measurements are signal-dominated at all redshifts on scales 
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where linear perturbation theory is reliable (as they are in our analysis) and a significant fraction of the extra- 
Galactic sky is surveyed. Generally, the ratio of signal to thermal noise is best at low redshifts, and this is also where 
the foregrounds are lowest. Since any redshift bin can only experience the knock-on effects of modified growth that 
occur prior to that epoch, the low redshift bins carry a greater amount of information on modified gravity. It is 
precisely in these bins where the noise from the experiment is smallest. 

One limitation of this work is our restriction to 20 narrow frequency windows equally spaced across the 400- 
800 MHz range for computational reasons. This is certainly lossy, equivalent to throwing away a significant fraction 
of the survey volume. While our Bq constraint is rather stable to increasing the number of windows or using closely 
spaced windows at low redshift, it is not clear how sensitive our PC A results are to the same changes. Since our most 
well-constrained modes exhibit broad features in redshift, we do not expect the form of these modes would change, 
although, generally, we expect the errors on some of the mode amplitudes would improve several-fold. The total 
number of well-constrained modes is expected to be stable. Ways to improve our current implementation include 
using all (u max — i/ m i n )/Av windows but only computing correlations that are expected to be significant, or, better, 
finding an improved radial basis that approximately decorrelates the signals. However, a detailed study of these issues 
is beyond the scope of this work. 

A further extension to this work would be to increase the grid resolution of the PCA. Since adding more frequency 
windows can improve constraints on modes exhibiting grid-scale variations, we may be able to recover more well- 
constrained modes this way. However, it is unclear how physical rapidly varying functions in the modified-gravity 
sector are. We have shown that our adopted grid resolution is sufficiently fine to reproduce f(R) gravity, and we 
speculate that the same would be true for other models. A complete Bayesian analysis would treat grids with 
different resolution as independent models, with the data deciding which model has the highest posterior probability, 
but exploring this issue is also beyond the scope of this work. 

Finally, we have provided a consistent derivation of the perturbed 21cm brightness temperature in linear theory 
in any metric theory of gravity. The new expression we derive includes all line-of-sight and relativistic effects, but 
these extra terms make only a small correction on large scales. The size of the correction increases with redshift and 
width of the frequency windows. We include the new terms in our analysis for numerical consistency, although they 
are negligible for our application to modified gravity. 

Our results show that 21 cm intensity mapping provides an exciting opportunity for testing GR on cosmological 
scales by probing the time and scale dependence of the clustering of neutral hydrogen. Additionally, such surveys allow 
geometric tests of the expansion history through the baryon acoustic oscillation feature and weak lensing tomography, 
at a fraction of the price of other large-scale structure surveys. 
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